Smart Paper Notes

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

Teven Le Scao, Angela Fan, Christopher Akiki, Ellie Pavlick, Suzana Ilić, Daniel Hesslow, Roman Castagné, Alexandra Sasha Luccioni, François Yvon, Matthias Gallé

Category: Natural Language Processing

2022-11-09

Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.

Translated by Google Translate

Body Composition Assessment with Limited Field-of-view Computed Tomography: A Semantic Image Extension Perspective

Kaiwen Xu, Thomas Li, Mirza S. Khan, Riqiang Gao, Sanja L. Antic, Yuankai Huo, Kim L. Sandler, Fabien Maldonado, Bennett A. Landman

Category: Computer Vision

2022-07-13

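Field-of-view (FOV) tissue truncation beyond the lungs is common in routine lung screening computed tomography (CT). This limits opportunistic CT-based body composition (BC) assessment, because key anatomical structures are missing. Traditionally, extending the FOV of CT has been treated as a CT reconstruction problem using limited data. However, this approach relies on projection-domain data that may not be available in practice. In this work, we formulate the problem from a semantic image extension perspective, which requires only image data as input. The proposed two-stage method identifies a new FOV border based on the estimated extent of the complete body and imputes the missing tissue in the truncated region. Training samples are simulated from CT slices with the complete body inside the FOV, which makes model development self-supervised. We evaluate the validity of the proposed method for automatic BC assessment using lung screening CT with limited FOV. The proposed method effectively restores the missing tissue and reduces the BC assessment error introduced by FOV tissue truncation. In BC assessment on a large-scale lung screening CT dataset, this correction improves both intra-subject consistency and the correlation with anthropometric approximations. The developed method is available at https://github.com/masilab/s-efov.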


Skeletal Video Anomaly Detection using Deep Learning: Survey, Challenges and Future Directions

Pratik K. Mishra, Alex Mihailidis, Shehroz S. Khan

Category: Computer Vision

2022-12-31

The existing methods for video anomaly detection mostly utilize videos containing identifiable facial and appearance-based features. The use of videos with identifiable faces raises privacy concerns, especially when used in a hospital or community-based setting. Appearance-based features can also be sensitive to pixel-based noise, straining the anomaly detection methods to model the changes in the background and making it difficult to focus on the actions of humans in the foreground. Structural information in the form of skeletons describing the human motion in the videos is privacy-protecting and can overcome some of the problems posed by appearance-based features. In this paper, we present a survey of privacy-protecting deep learning anomaly detection methods using skeletons extracted from videos. We present a novel taxonomy of algorithms based on the various learning approaches. We conclude that skeleton-based approaches for anomaly detection can be a plausible privacy-protecting alternative for video anomaly detection. Lastly, we identify major open research questions and provide guidelines to address them.


Privacy-Protecting Behaviours of Risk Detection in People with Dementia using Videos

Pratik K. Mishra, Andrea Iaboni, Bing Ye, Kristine Newman, Alex Mihailidis, Shehroz S. Khan

Category: Computer Vision

2022-12-20

People living with dementia often exhibit behavioural and psychological symptoms of dementia that can put their own and others' safety at risk. Existing video surveillance systems in long-term care facilities can be used to monitor such behaviours of risk and alert staff in order to prevent potential injuries or, in some cases, death. However, these behaviours of risk events are heterogeneous and infrequent in comparison to normal events. Moreover, analyzing raw videos can also raise privacy concerns. In this paper, we present two novel privacy-protecting video-based anomaly detection approaches to detect behaviours of risk in people with dementia. We either extract body pose information as skeletons or use semantic segmentation masks to replace multiple humans in the scene with their semantic boundaries. Our work differs from most existing approaches for video anomaly detection, which focus on appearance-based features that can put the privacy of a person at risk and are also susceptible to pixel-based noise, including illumination and viewing direction. We used anonymized videos of normal activities to train customized spatio-temporal convolutional autoencoders and identify behaviours of risk as anomalies. We show our results on a real-world study conducted in a dementia care unit with patients with dementia, containing approximately 21 hours of normal activities data for training and 9 hours of data containing normal and behaviours of risk events for testing. We compared our approaches with the original RGB videos and obtained an equivalent area under the receiver operating characteristic curve performance of 0.807 for the skeleton-based approach and 0.823 for the segmentation mask-based approach. This is one of the first studies to incorporate privacy for the detection of behaviours of risk in people with dementia.
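The evaluation described in this abstract (an autoencoder trained on normal activity, with anomalies flagged by poor reconstruction and scored by area under the ROC curve) can be illustrated with a small sketch. This is not the authors' code: the function names `reconstruction_error` and `roc_auc` are hypothetical, frames are assumed to be flat lists of numbers, and the AUC is computed with the standard rank-based (Mann-Whitney) definition rather than any particular library.

```python
from typing import Sequence


def reconstruction_error(frame: Sequence[float], reconstruction: Sequence[float]) -> float:
    """Mean squared error between an input frame and its autoencoder reconstruction.

    Assumption: both are flat sequences of equal length. A higher error is
    treated as a higher anomaly score.
    """
    return sum((a - b) ** 2 for a, b in zip(frame, reconstruction)) / len(frame)


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via the Mann-Whitney statistic.

    Equals the probability that a randomly chosen anomalous sample (label 1)
    receives a higher score than a randomly chosen normal sample (label 0),
    with ties counted as one half.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one anomalous and one normal sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

For example, `roc_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])` gives 0.75: three of the four (anomalous, normal) pairs are ranked correctly. The pairwise loop is quadratic in the number of samples, which is fine for illustration but would be replaced by a sort-based computation on hours of video.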


MSI: Maximize Support-Set Information for Few-Shot Segmentation

Seonghyeon Moon, Samuel S. Sohn, Honglu Zhou, Sejong Yoon, Vladimir Pavlovic, Muhammad Haris Khan, Mubbasir Kapadia

Category: Computer Vision

2022-12-09

Few-shot segmentation (FSS) aims to segment a target class with a small number of labeled images (the support set). To extract information relevant to the target class, a dominant approach in the best-performing FSS baselines removes background features using a support mask. We observe that this support mask presents an information bottleneck in several challenging FSS cases, e.g., for small targets and/or inaccurate target boundaries. To this end, we present a novel method (MSI), which maximizes the support-set information by exploiting two complementary sources of features in generating super correlation maps. We validate the effectiveness of our approach by instantiating it in three recent and strong FSS baselines. Experimental results on several publicly available FSS benchmarks show that our proposed method consistently improves performance by visible margins and allows faster convergence. Our code and models will be publicly released.


Brain Tumor MRI Classification using a Novel Deep Residual and Regional CNN

Mirza Mumtaz Zahoor, Saddam Hussain Khan

Category: Computer Vision | Machine Learning

2022-11-29

Brain tumor classification is crucial for clinical analysis and an effective treatment plan to cure patients. Deep learning models help radiologists to accurately and efficiently analyze tumors without manual intervention. However, brain tumor analysis is challenging because of the tumors' complex structure, texture, size, location, and appearance. Therefore, a novel deep residual and regional-based Res-BRNet Convolutional Neural Network (CNN) is developed for effective brain tumor Magnetic Resonance Imaging (MRI) classification. The developed Res-BRNet employs regional and boundary-based operations in a systematic order within the modified spatial and residual blocks. Moreover, the spatial block extracts homogeneity and boundary-defined features at the abstract level. Furthermore, the residual blocks employed at the target level significantly learn local and global texture variations of different classes of brain tumors. The efficiency of the developed Res-BRNet is evaluated on a standard dataset collected from Kaggle and Figshare, containing various tumor categories, including meningioma, glioma, pituitary, and healthy images. Experiments show that the developed Res-BRNet outperforms the standard CNN models and attains excellent performance (accuracy: 98.22%, sensitivity: 0.9811, F-score: 0.9841, and precision: 0.9822) on challenging datasets. Additionally, the performance of the proposed Res-BRNet indicates a strong potential for medical image-based disease analyses.
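The four figures reported in this abstract (accuracy, sensitivity, precision, F-score) are all derived from confusion-matrix counts. As a reminder of how they relate, here is a minimal sketch for the binary (one class versus the rest) case. The function name `classification_metrics` and the example counts are hypothetical and are not taken from the paper, which does not state how its multi-class scores were averaged.

```python
from typing import Dict


def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Compute standard metrics from binary confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives and
    true negatives. Ratios with an empty denominator are reported as 0.0.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # also called recall
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    # F-score (F1) is the harmonic mean of precision and sensitivity.
    f_score = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "precision": precision,
        "f_score": f_score,
    }
```

With illustrative counts `tp=90, fp=10, fn=10, tn=90`, all four metrics come out at approximately 0.9. On imbalanced data the metrics diverge, which is why reporting sensitivity and precision alongside accuracy is informative.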


MAISON -- Multimodal AI-based Sensor platform for Older Individuals

Ali Abedi, Faranak Dayyani, Charlene Chu, Shehroz S. Khan

Category: Machine Learning | Artificial Intelligence

2022-11-07

The global population is aging, creating a need for tools that enable older adults to live more independently and age at home, and that assist healthcare workers. It is feasible to achieve this objective by building predictive models that assist healthcare workers in monitoring and analyzing older adults' behavioral, functional, and psychological data. To develop such models, a large amount of multimodal sensor data is typically required. In this paper, we propose MAISON, a scalable cloud-based platform of commercially available smart devices capable of collecting desired multimodal sensor data from older adults and patients living in their own homes. The MAISON platform is novel due to its ability to collect a greater variety of data modalities than the existing platforms, as well as its new features that result in seamless data collection and ease of use for older adults who may not be digitally literate. We demonstrated the feasibility of the MAISON platform with two older adults discharged home from a large rehabilitation center. The results indicate that the MAISON platform was able to collect and store sensor data in a cloud without functional glitches or performance degradation. This paper also discusses the challenges faced during the development of the platform and data collection in the homes of older adults. MAISON is a novel platform designed to collect multimodal data and facilitate the development of predictive models for detecting key health indicators, including social isolation, depression, and functional decline, and is feasible to use with older adults in the community.


Deep Learning Enabled Time-Lapse 3D Cell Analysis

Jiaxiang Jiang, Amil Khan, S. Shailja, Samuel A. Belteton, Michael Goebel, Daniel B. Szymanski, B. S. Manjunath

Category: Computer Vision

2022-08-17

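This paper presents a method for time-lapse 3D cell analysis. Specifically, we consider the problem of accurately localizing and quantitatively analyzing sub-cellular features, and of tracking individual cells from time-lapse 3D confocal cell image stacks. The heterogeneity of cells and the volume of multi-dimensional images present a major challenge for fully automated analysis of cell morphogenesis and development. This paper is motivated by the pavement cell growth process and by building a quantitative morphogenesis model. We propose a deep feature-based segmentation method to accurately detect and label each cell region. An adjacency graph-based method is used to extract sub-cellular features of the segmented cells. Finally, a robust graph-based tracking algorithm using multiple cell features is proposed for associating cells at different time instances. Extensive experimental results are provided and demonstrate the robustness of the proposed method. The code is available on GitHub, and the method is available as a service through the BisQue portal.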


Inconsistencies in Measuring Student Engagement in Virtual Learning -- A Critical Review

Shehroz S. Khan, Ali Abedi, Tracey Colella

Category: Computer Vision

2022-08-09

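In recent years, virtual learning has emerged as an alternative to traditional classroom teaching. Student engagement in virtual learning can have a major impact on meeting learning objectives and on program dropout risk. Many measurement instruments are specifically geared to Student Engagement (SE) in virtual learning environments. In this critical review, we analyze these works and highlight inconsistencies in terms of differing engagement definitions and measurement scales. This diversity among existing studies can be problematic when comparing different annotations and building generalizable predictive models. We further discuss issues with engagement annotations and design flaws. We analyze the existing SE annotation scales along our seven defined dimensions of engagement annotation: sources, data modalities used for annotation, the time when the annotation takes place, the timesteps in which the annotation takes place, level of abstraction, combination, and quantification. One of the surprising findings is that very few of the reviewed datasets on SE measurement used existing psychometrically validated engagement scales in their annotation. Lastly, we discuss some scales from settings other than virtual learning that have the potential to be used for measuring SE in virtual learning.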


LWGNet: Learned Wirtinger Gradients for Fourier Ptychographic Phase Retrieval

Atreyee Saha, Salman S Khan, Sagar Sehrawat, Sanjana S Prabhu, Shanti Bhattacharya, Kaushik Mitra

Category: Computer Vision

2022-08-08

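Fourier ptychographic microscopy (FPM) is an imaging procedure that overcomes the traditional limit on the space-bandwidth product (SBP) of conventional microscopes through computational means. It uses multiple images captured with a low numerical aperture (NA) objective and enables high-resolution phase imaging through frequency-domain stitching. Existing FPM reconstruction methods can be broadly categorized into two approaches: iterative optimization-based methods, which are based on the physics of the forward imaging model, and data-driven methods, which commonly employ a feed-forward deep learning framework. We propose a hybrid model-driven residual network that combines knowledge of the forward imaging system with a deep data-driven network. Our proposed architecture, LWGNet, unrolls the traditional Wirtinger flow optimization algorithm into a novel neural network design that enhances the gradient images through complex convolutional blocks. Unlike other conventional unrolling techniques, LWGNet uses fewer stages while performing on par with or even better than existing traditional and deep learning techniques, particularly for low-cost and low dynamic range CMOS sensors. This improvement in performance for low bit-depth and low-cost sensors has the potential to significantly reduce the cost of FPM imaging setups. Finally, we show consistently improved performance on our collected real data.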
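For readers unfamiliar with the Wirtinger flow algorithm that LWGNet unrolls, the sketch below shows one classical gradient step for generic phase retrieval. It is an illustration under stated assumptions, not the paper's method: the measurement model is the generic y_i = |a_i^H z|^2 with hypothetical measurement vectors `A`, not the FPM forward model, and there are no learned convolutional blocks. The loss is f(z) = (1/2m) Σ (|a_i^H z|^2 - y_i)^2, whose Wirtinger gradient is (1/m) Σ (|a_i^H z|^2 - y_i)(a_i^H z) a_i.

```python
from typing import List, Sequence

Vector = Sequence[complex]


def inner(a: Vector, z: Vector) -> complex:
    """Complex inner product a^H z (conjugate applied to a)."""
    return sum(aj.conjugate() * zj for aj, zj in zip(a, z))


def loss(A: Sequence[Vector], y: Sequence[float], z: Vector) -> float:
    """Intensity-domain least-squares loss (1/2m) * sum (|a_i^H z|^2 - y_i)^2."""
    m = len(A)
    return sum((abs(inner(a, z)) ** 2 - yi) ** 2 for a, yi in zip(A, y)) / (2 * m)


def wirtinger_gradient(A: Sequence[Vector], y: Sequence[float], z: Vector) -> List[complex]:
    """Wirtinger gradient of the loss with respect to conj(z)."""
    m, n = len(A), len(z)
    grad = [0j] * n
    for a, yi in zip(A, y):
        p = inner(a, z)              # a_i^H z
        residual = abs(p) ** 2 - yi  # intensity mismatch for measurement i
        for j in range(n):
            grad[j] += residual * p * a[j]
    return [g / m for g in grad]


def wirtinger_step(A: Sequence[Vector], y: Sequence[float], z: Vector, step: float) -> List[complex]:
    """One plain gradient-descent update z <- z - step * grad."""
    g = wirtinger_gradient(A, y, z)
    return [zj - step * gj for zj, gj in zip(z, g)]
```

At the true signal the residuals vanish, so the gradient is zero; from a nearby estimate, a sufficiently small step lowers the loss. An unrolled network in the spirit of the abstract would run a fixed number of such stages and pass the gradient through learned blocks before each update; that part is omitted here.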
